We present an approach to combining distributional semantic representations induced from text corpora with manually constructed lexical-semantic networks. While both kinds of semantic resources are available with high lexical coverage, our aligned resource combines the domain specificity and availability of contextual information from distributional models with the conciseness and high quality of manually crafted lexical networks. We start with a distributional representation of induced senses of vocabulary terms, which are accompanied by rich context information given by related lexical items. We then automatically disambiguate such representations to obtain a full-fledged proto-conceptualization, i.e., a typed graph of induced word senses. In a final step, this proto-conceptualization is aligned to a lexical ontology, resulting in a hybrid aligned resource. Moreover, unmapped induced senses are associated with a semantic type in order to connect them to the core resource. Manual evaluations against ground-truth judgments for different stages of our method, as well as an extrinsic evaluation on a knowledge-based Word Sense Disambiguation benchmark, all indicate the high quality of the new hybrid resource. Additionally, we show the benefits of enriching top-down lexical knowledge resources with bottom-up distributional information from text for addressing high-end knowledge acquisition tasks such as cleaning hypernym graphs and learning taxonomies from scratch.